Skip to content

Conversation

@RaphDal
Copy link
Contributor

@RaphDal RaphDal commented Oct 13, 2025

This PR adds decimal support.

This is a tandem pr for:

Usage

Decimal object

from decimal import Decimal

sender.row(
    'trades',
    symbols={
        'symbol': 'ETH-USD',
        'side': 'sell'},
    columns={
        'price': 2615.54,
        'amount': Decimal(0.00044),
        },
    at=TimestampNanos.now())

Progress

  • support binary and text formats
  • update questdb version when decimal is released

Summary by CodeRabbit

  • New Features

    • End-to-end Decimal (fixed‑point) support: ingestion, ILP DECIMAL, Pandas/PyArrow/NumPy paths; public APIs accept Decimal and protocol v3.
  • Documentation

    • Added Decimals datatype mapping, examples, and usage notes (duplicated for parity).
  • Bug Fixes

    • Clarified protocol-version validation/messages and added Decimal-specific error reporting.
  • Tests

    • Extensive decimal serialization/Arrow/Pandas tests and protocol v3 coverage; test fixtures/versions updated.
  • Chores

    • Added compatibility/conversion utilities, CI/workflow tweaks, submodule and dependency updates.

✏️ Tip: You can customize this high-level summary in your review settings.

@RaphDal RaphDal changed the title feat: decimal support feat: support decimal Oct 17, 2025
@RaphDal
Copy link
Contributor Author

RaphDal commented Oct 21, 2025

@CodeRabbit review

@coderabbitai
Copy link

coderabbitai bot commented Oct 21, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Oct 21, 2025

Walkthrough

Adds Decimal support across docs, dataframe serialization, mpdecimal compatibility, line-sender APIs and error codes, protocol v3 support, rpyutils big-endian conversion, tests and mock-server updates, CI and build tweaks, and a c-questdb-client submodule update.

Changes

Cohort / File(s) Summary
Submodule
c-questdb-client
Submodule reference updated to a new commit.
Documentation
src/questdb/dataframe.md
Adds Decimals documentation, examples (Pandas / NumPy / PyArrow), supported decimal types and null handling; duplicates section for parity.
Dataframe serialization (Cython)
src/questdb/dataframe.pxi
Adds decimal target/source enums and dispatch codes, col.scale field, byte-swap helpers, Arrow decimal resolution/validation (scale 0–76), Decimal sniffing, and per-variant serializers for Python Decimal and Arrow decimal types; integrates into dispatch.
mpdecimal compatibility (C / Cython)
src/questdb/mpdecimal_compat.h, src/questdb/mpdecimal_compat.pxd
New mpd_t / PyDecObject types, platform typedefs and MPD_RADIX/flags; adds inline helper decimal_pyobj_to_binary to convert Python Decimal → big-endian two's-complement bytes and encoded scale with NaN/Inf and scale validation.
Ingress / Line-sender APIs
src/questdb/ingress.pyi, src/questdb/ingress.pyx, src/questdb/line_sender.pxd
Adds Decimal to Buffer/SenderTransaction row unions, introduces IngressErrorCode.DecimalError and C-code mapping, exposes line_sender_protocol_version_3 and new decimal column functions, adds Buffer._column_decimal dispatch, and expands protocol-version validation/wiring to include v3.
Tests & Mock server
test/test.py, test/test_dataframe.py, test/mock_server.py, test/system_test.py
Adds decimal test constants/helpers and tests (pyobj/Arrow), new protocol-v3 test class, mock-server settings updated to include [1,2,3], and QUESTDB_VERSION / FIRST_DECIMAL_RELEASE updates.
rpyutils / mpd→bigendian
rpyutils/src/mpd_to_bigendian.rs, rpyutils/include/rpyutils.h, rpyutils/Cargo.toml, rpyutils/src/lib.rs
New Rust module implementing mpdecimal-limb → big-endian two's-complement conversion with FFI qdb_mpd_to_bigendian; header exposes function; Cargo adds runtime i256 dependency; lib.rs declares module.
rpyutils Cython binding
src/questdb/rpyutils.pxd
Adds extern declaration for qdb_mpd_to_bigendian to Cython bindings.
Build / setup
setup.py
Passes extension extra_objects as explicit depends to Extension.
CI
ci/run_tests_pipeline.yaml, ci/cibuildwheel.yaml
YAML quoting/format tweaks, displayName adjustments, macOS image bumped to macOS-14.

Sequence Diagram(s)

sequenceDiagram
    participant User as User code
    participant API as Buffer / SenderTransaction
    participant Compat as mpdecimal_compat
    participant Rpy as rpyutils (qdb_mpd_to_bigendian)
    participant Serializer as Line Sender
    participant ILP as ILP Buffer

    User->>API: row(..., columns={'a': Decimal(...)})
    API->>Compat: decimal_pyobj_to_binary(PyDecimal)
    Note right of Compat `#f0f7de`: extract mpd_t (flags, exp, digits)
    alt NaN or Inf
        Compat-->>API: special indicator (empty payload, scale=0)
    else Normal Decimal
        Compat->>Rpy: qdb_mpd_to_bigendian(limbs, len, radix, exp, sign)
        Rpy-->>Compat: (unscaled_bytes, byte_width)
        Compat-->>API: (unscaled_bytes, encoded_scale)
        API->>Serializer: format decimal field (scale, width, bytes)
        Serializer->>ILP: append decimal field to ILP buffer (protocol v1–3)
    end
    ILP-->>User: serialized ILP message
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas needing extra attention:
    • src/questdb/mpdecimal_compat.h / src/questdb/mpdecimal_compat.pxd (struct layout, platform typedefs, MPD_RADIX, flags).
    • decimal_pyobj_to_binary path: digit assembly, scale validation, NaN/Inf semantics.
    • src/questdb/dataframe.pxi: new enums/dispatch codes, col.scale usage, Arrow decimal resolution and serializers.
    • FFI and Rust: rpyutils/src/mpd_to_bigendian.rs and rpyutils/include/rpyutils.h (numeric range/overflow, trimming, buffer safety).
    • src/questdb/ingress.pyx / line_sender.pxd: error-code mapping, new C APIs, protocol version 3 wiring.
    • Tests and mock-server defaults verifying protocol/version interactions and decimal test coverage.

Poem

🐰
I nibbled digits, counted scales with care,
Packed bytes in order, big-endian flair.
Arrow and Decimal now hop the stream,
Protocol three widened our scheme.
Precision tucked in every carrot I share.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 89.29% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title 'feat: decimal support' clearly summarizes the main change - adding Decimal support to the py-questdb-client library, which is the primary focus of the PR across all modified files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch rd_decimal

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/questdb/ingress.pyx (1)

1043-1049: Minor: fix valid-types error message (missing comma concatenates entries)

'datetime.datetime' 'numpy.ndarray' concatenates into one token.

Apply this diff:

-                'TimestampMicros',
-                'datetime.datetime'
-                'numpy.ndarray'))
+                'TimestampMicros',
+                'datetime.datetime',
+                'numpy.ndarray'))
src/questdb/ingress.pyi (1)

1030-1039: Add Decimal to the columns type signature.

The Sender.row method's columns parameter is missing Decimal in its type union, while both Buffer.row (line 386) and SenderTransaction.row (line 207) include it. This inconsistency will cause type checking errors when users try to pass Decimal values to Sender.row.

Apply this diff to add Decimal to the type union:

     columns: Optional[
-            Dict[str, Union[bool, int, float, str, TimestampMicros, datetime, np.ndarray]]
+            Dict[str, Union[bool, int, float, str, TimestampMicros, datetime, np.ndarray, Decimal]]
     ] = None,
🧹 Nitpick comments (6)
src/questdb/ingress.pyx (1)

2414-2429: Doc nit: mention protocol version 3

Property doc explains v1 and v2 only. Consider adding a short note for v3 to avoid confusion.

src/questdb/dataframe.md (1)

96-106: Decimal docs read well

Clear coverage across pandas/NumPy/Arrow with examples; fits new tests. Consider adding the supported scale range (0–76) note here for completeness.

Also applies to: 129-157

src/questdb/ingress.pyi (1)

709-711: Clarify null representation for Decimal columns.

The table shows Y (NaN) for nulls in the Decimal row. However, NaN is typically associated with float types. For Decimal objects, nulls are represented as None or pandas.NA, not NaN. Consider changing this to just Y or Y (None) for clarity.

Apply this diff if you agree:

             * - ``'object'`` (``Decimal`` objects)
-              - Y (``NaN``)
+              - Y
               - ``DECIMAL``
src/questdb/mpdecimal_compat.h (1)

1-19: Document CPython version compatibility assumptions.

This compatibility layer relies on CPython's internal Decimal implementation details (struct layout and limb size). These internals may change between CPython versions. Consider:

  1. Adding a comment documenting which CPython versions are supported (e.g., 3.8+)
  2. Adding runtime checks in the Cython code to verify struct layout hasn't changed
  3. Noting in documentation that this is a best-effort compatibility layer

Example comment to add:

+/* 
+ * Compatibility layer for CPython's decimal module (libmpdec).
+ * Tested with CPython 3.8 through 3.12.
+ * May break with future CPython versions if internal Decimal layout changes.
+ */
+
 /* Determine the limb type used by CPython's libmpdec build. */
 #if SIZE_MAX == UINT64_MAX
src/questdb/dataframe.pxi (2)

59-73: Add comments explaining byte-swap usage for Arrow decimals.

The bswap32 and bswap64 functions are used later for Arrow decimal types (lines 2226, 2245, etc.), but it's not immediately clear why byte-swapping is needed. Arrow stores decimal values in big-endian format, while the ILP protocol expects a specific byte order.

Add a comment explaining the endianness conversion:

+# Arrow decimal types store values in big-endian format (network byte order).
+# These functions convert to the format expected by the ILP protocol.
 cdef inline uint32_t bswap32(uint32_t value):

2213-2295: LGTM! Arrow decimal serialization correctly handles all bit widths.

All four Arrow decimal serialization functions properly:

  • Check Arrow validity bitmaps
  • Send NULL for invalid values
  • Perform correct byte-swapping for endianness conversion
  • Use the stored scale from column metadata

The 128-bit and 256-bit handlers correctly swap both byte order within each 64-bit word and reverse the word order.

Optional: Consider a helper function to reduce duplication.

The four functions have similar structure. You could extract common logic:

cdef void_int _arrow_decimal_to_bytes(
        col_t* col, 
        size_t byte_count,
        uint64_t* out_buffer,
        bint* valid_out) noexcept nogil:
    """Extract and byte-swap Arrow decimal to output buffer."""
    # Common extraction and swapping logic

This would reduce duplication and make maintenance easier, though the current approach is also acceptable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 996b251 and 1d4c69f.

📒 Files selected for processing (10)
  • c-questdb-client (1 hunks)
  • src/questdb/dataframe.md (1 hunks)
  • src/questdb/dataframe.pxi (17 hunks)
  • src/questdb/ingress.pyi (7 hunks)
  • src/questdb/ingress.pyx (5 hunks)
  • src/questdb/line_sender.pxd (3 hunks)
  • src/questdb/mpdecimal_compat.h (1 hunks)
  • src/questdb/mpdecimal_compat.pxd (1 hunks)
  • test/test.py (4 hunks)
  • test/test_dataframe.py (5 hunks)
🧰 Additional context used
🪛 Clang (14.0.6)
src/questdb/mpdecimal_compat.h

[error] 4-4: 'Python.h' file not found

(clang-diagnostic-error)

🪛 Ruff (0.14.1)
test/test_dataframe.py

90-90: Avoid specifying long messages outside the exception class

(TRY003)


93-93: Avoid specifying long messages outside the exception class

(TRY003)


95-95: Avoid specifying long messages outside the exception class

(TRY003)


97-97: Avoid specifying long messages outside the exception class

(TRY003)


102-103: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
  • GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
  • GitHub Check: questdb.py-questdb-client (Building and testing on mac)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
🔇 Additional comments (29)
c-questdb-client (1)

1-1: Submodule reference update requires verification of C extension changes.

This file contains only a pointer update to the C extension submodule. The actual Decimal support implementation in the C extension (questdb/c-questdb-client) is not accessible for review from this context.

Given that the broader PR adds significant Decimal support across the Python wrapper (dataframe serialization, ILP ingestion, type signatures), ensure that:

  1. The C extension at commit 5b17715... includes corresponding Decimal serialization/deserialization logic.
  2. The binary protocol changes (if any) are compatible with the Python-side changes.
  3. The submodule commit has been tested with the tandem QuestDB core PR (questdb/questdb#6068).

Note: This PR is marked "DO NOT MERGE" and depends on upstream changes.

To verify C extension compatibility, you may want to:

  • Inspect the C extension diff at the target commit to ensure Decimal support aligns with the Python wrapper changes.
  • Confirm that protocol version 3 support (mentioned in the AI summary) is implemented in the C extension.
  • Verify integration tests pass with the updated submodule.
test/test.py (3)

45-49: V3 pandas tests import looks good

Keeps suite discoverable only when pandas is present.


415-425: Protocol-version validation updates are correct

Treating 3 as valid and 4/'4' as invalid with the updated error text matches the new Sender/Buffer checks.

If CI still runs the “unsupported client for V3” test, please confirm it’s updated (or gated) to reflect that the client now supports V3.

Also applies to: 430-432


1478-1479: Public name fix for V2

Renaming to “protocol version 2” is consistent with the class.

test/test_dataframe.py (3)

84-121: Decimal payload helpers are solid

Helpers make intent clear and align with the binary format used in assertions.


570-585: Comprehensive decimal test coverage

Covers pyobj decimals (incl. special values) and Arrow decimals across widths; version-gated appropriately.

Also applies to: 586-597, 598-608, 609-646


1705-1709: Updated error-message regex

The new wording (“Unsupported arrow type …”) matches current behavior.

src/questdb/mpdecimal_compat.pxd (1)

24-71: Decimal → ILP conversion helper looks correct

  • Handles NaN/Inf as nulls.
  • Builds unscaled integer from mpd limbs correctly (LE limbs × MPD_RADIX).
  • Enforces max scale 76 and applies sign.

One note: zero encodes to an empty mantissa (length 0), which matches the tests’ “special values” treatment; confirm your wire format also expects empty mantissa for numeric zero, or adjust to emit a single 0x00 byte.

src/questdb/ingress.pyx (1)

1241-1260: Decimal support is limited to dataframe() path, not row(); verify PR description scope and consider scoping Decimal to dataframe only

The review comment is accurate. After examining the codebase:

  • Buffer.row() columns parameter type hint excludes Decimal (only: bool, int, float, str, TimestampMicros, TimestampNanos, datetime.datetime, numpy.ndarray)
  • Decimal support via decimal_pyobj_to_binary is implemented only in dataframe.pxi
  • No _column_decimal method exists in the Buffer class; only _column_bool, _column_i64, _column_f64, _column_str, _column_ts_micros, _column_ts_nanos, _column_numpy
  • Decimal in dataframe requires protocol v3 (tests skip for version < 3)

If the PR description shows sender.row(... Decimal(...)), the documentation/example is inconsistent with the implementation. Either add Decimal support to row() (requiring _column_decimal and protocol v3 guard) or scope examples/docs to dataframe-only.

src/questdb/line_sender.pxd (2)

43-56: LGTM! Protocol version and error code additions are well-structured.

The addition of line_sender_error_invalid_decimal and line_sender_protocol_version_3 follows existing conventions and provides the necessary foundation for Decimal support.


268-282: LGTM! Decimal buffer functions follow established patterns.

The two new functions line_sender_buffer_column_dec_str and line_sender_buffer_column_dec are well-designed:

  • Consistent with existing column buffer functions
  • Support both text (string) and binary formats
  • Include proper error handling via err_out parameter
src/questdb/ingress.pyi (4)

43-61: LGTM! Import and error code additions are correct.

The import of Decimal and the addition of DecimalError to the IngressErrorCode enum are necessary for type checking support.


207-207: LGTM! Type signature correctly includes Decimal.

The addition of Decimal to the columns parameter type union in SenderTransaction.row enables proper type checking for decimal column values.


386-386: LGTM! Type signature correctly includes Decimal.

The addition of Decimal to the columns parameter type union in Buffer.row enables proper type checking for decimal column values.


407-456: LGTM! Documentation clearly illustrates Decimal usage.

The example usage and type mapping table additions help users understand:

  • How to pass Decimal values in the columns dict
  • The mapping from Python Decimal to ILP DECIMAL type
src/questdb/mpdecimal_compat.h (3)

21-35: Add runtime validation for struct layout assumptions.

The mpd_t and PyDecObject struct definitions assume a specific memory layout that matches CPython's internal implementation. If CPython changes these internals, this code will silently produce incorrect results or crash.

Consider adding runtime checks in the Cython initialization code (e.g., in mpdecimal_compat.pxd or module init) to verify:

  1. Size of Python Decimal objects matches expectations
  2. Basic sanity checks on extracted values (e.g., comparing against decimal module's official API)

Example validation approach:

# At module initialization
test_decimal = Decimal("123.45")
# Extract using compatibility layer
# Also extract using official decimal API
# Assert they match

This would catch breaking changes early rather than producing silent corruption.


37-44: LGTM! Accessor functions correctly handle inline vs heap storage.

The decimal_digits() function properly handles both storage modes:

  • Heap-allocated: uses dec->dec.data
  • Inline (small decimals): uses dec->data[4]

This matches CPython's optimization for small decimal values.


46-54: LGTM! Flag definitions match mpdecimal constants.

The flag enum and MPD_RADIX constant definitions are correct and consistent with libmpdec's public interface.

src/questdb/dataframe.pxi (11)

96-110: LGTM! Enum additions for decimal target are correct.

The addition of col_target_column_decimal = 9 and updating col_target_at = 10 maintains the enum sequence. The target name "decimal" in _TARGET_NAMES is consistent with other entries.


152-179: LGTM! Decimal source types cover all supported formats.

The five decimal source types provide comprehensive coverage:

  • col_source_decimal_pyobj: Python Decimal objects
  • col_source_decimal32/64/128/256_arrow: Arrow decimal types of different bit widths

The inclusion in _PYOBJ_SOURCE_DESCR enables clear error messages.


249-272: LGTM! Target-to-source mappings are complete.

The _TARGET_TO_SOURCES mapping correctly includes all five decimal source types for the col_target_column_decimal target. The addition to _FIELD_TARGETS ensures decimal columns are recognized as field columns.


397-406: LGTM! Dispatch codes follow established patterns.

The five dispatch codes combining col_target_column_decimal with each decimal source type enable efficient routing in the serialization switch statement. This follows the same pattern used for other column types.


427-432: LGTM! Scale field addition is well-documented.

The scale field in col_t correctly stores the decimal scale for Arrow types. The comment clearly indicates it's only used for Arrow decimals and defaults to 0. uint8_t is sufficient for the 0-76 scale range.


956-979: LGTM! Arrow decimal type resolution is comprehensive.

The _dataframe_series_resolve_arrow function correctly:

  • Handles all four Arrow decimal bit widths (32/64/128/256)
  • Validates scale is within the supported 0-76 range
  • Provides clear error messages with GitHub issue link
  • Stores the scale for later use in serialization

1046-1047: LGTM! Decimal detection in object columns is correct.

The isinstance(<object>obj, Decimal) check properly identifies Decimal objects in pandas object-dtype columns and sets the appropriate source type. This is consistent with other type detection logic in _dataframe_series_sniff_pyobj.


1163-1164: LGTM! ArrowDtype support enables Arrow-backed decimal columns.

The handling of _PANDAS.ArrowDtype by delegating to _dataframe_series_resolve_arrow correctly enables support for Arrow-backed columns in pandas, including Arrow decimal types.


1300-1315: LGTM! Formatting improves readability of bitwise operations.

The reformatting of the bitwise validity checks with explicit parentheses and line breaks makes the bit manipulation logic clearer without changing behavior.


2171-2210: LGTM! Python Decimal serialization handles all cases correctly.

The serialization function properly handles:

  • Null values (returns early)
  • Special values like NaN/Inf (sends as NULL to server)
  • Mantissa size validation (127-byte limit)
  • Error reporting via IngressError

The use of decimal_pyobj_to_binary encapsulates the complex decimal-to-binary conversion logic.


2456-2465: LGTM! Dispatch switch correctly routes all decimal types.

The five decimal dispatch cases properly route to their respective serialization functions. The GIL handling is correct:

  • decimal_pyobj doesn't pass gs (requires GIL)
  • Arrow variants pass gs (can release GIL)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/questdb/dataframe.pxi (1)

972-977: Verify the decimal scale range (0-76) is correct.

The code validates that the Arrow decimal scale is between 0 and 76 inclusive. Please confirm this range matches QuestDB's decimal column constraints and is documented in the database's decimal support specifications.

If this range is database-specific, consider adding a constant or comment referencing the specification:

+    # QuestDB decimal scale range: 0-76 (see: https://questdb.io/docs/reference/sql/datatypes/#decimal)
     if arrowtype.scale < 0 or arrowtype.scale > 76:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d4c69f and b62520a.

📒 Files selected for processing (1)
  • src/questdb/dataframe.pxi (17 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.

Applied to files:

  • src/questdb/dataframe.pxi
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
  • GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
  • GitHub Check: questdb.py-questdb-client (Building and testing on mac)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
🔇 Additional comments (6)
src/questdb/dataframe.pxi (6)

59-73: Byte-swap utilities look correct.

The inline bswap32 and bswap64 functions implement standard endianness conversion. These are essential for converting Arrow's little-endian decimal representations to the big-endian format expected by the ILP protocol.

Ensure the byte-swapping operations are thoroughly tested, especially for the complex multi-chunk swaps used in decimal128 and decimal256 serialization (lines 2265-2267, 2287-2291).


96-97: Decimal enum and mapping additions look correct.

The new col_target_column_decimal target, associated sources, and dispatch codes are consistently integrated into the existing enum and mapping structures. The col_target_at offset correctly shifted from 9 to 10.

Also applies to: 110-110, 179-179, 249-255, 271-272, 397-406


432-432: Scale field addition is appropriate.

The scale field is used to store the Arrow decimal type's scale (set during resolution on line 978) and is unused for non-decimal columns. For Python Decimal objects, the scale is extracted during serialization from the object itself.


1046-1047: Decimal type detection is well-integrated.

The decimal sniffing in object columns (lines 1046-1047) and ArrowDtype handling (lines 1163-1164) follow the established patterns and integrate cleanly with existing type detection logic.

Also applies to: 1163-1164


2456-2465: Decimal dispatch integration looks correct.

The dispatch logic correctly routes each decimal dispatch code to its corresponding serialization function, following the established pattern for other column types.


2171-2296: Verify decimal mantissa size limit and byte-swapping correctness.

The decimal serialization implementation has well-tested code paths for Python Decimals and Arrow decimal types, with proper special value handling (NaN/Infinity produce NULL markers). However:

  1. Undocumented 127-byte mantissa limit (line 2197): The hardcoded limit lacks any explanation. Verify this aligns with QuestDB's ILP wire format specification—it appears reasonable for ~307 decimal digits but needs confirmation it's not arbitrary or outdated.

  2. Complex byte-swapping for multi-chunk decimals (decimal128 at lines 2265–2267, decimal256 at lines 2287–2291): The chunk reversal and byte-swap logic is correct in principle but should be validated against actual serialized output, especially endianness correctness across platforms.

  3. No comments on pointer arithmetic: The bit-shift indexing (offset << 1, offset << 2) is correct but could benefit from a brief inline comment explaining the multi-chunk array indexing.

Confirm:

  • 127-byte limit matches QuestDB protocol constraints
  • Byte-swap output is verified correct via integration testing
  • Logic handles edge cases (min/max values, overflow near 127-byte boundary)

RaphDal and others added 3 commits November 7, 2025 09:33
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/test.py (1)

437-437: Remove debug print statement.

This print statement appears to be a debug artifact that should be removed before merging.

Apply this diff:

-                    print("Testing bad protocol version:", version)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d7ee5e and 5b4cbfa.

📒 Files selected for processing (1)
  • test/test.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/test.py (2)
test/test_dataframe.py (1)
  • TestPandasProtocolVersionV3 (1806-1808)
test/mock_server.py (2)
  • HttpServer (124-211)
  • port (210-211)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
  • GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
  • GitHub Check: questdb.py-questdb-client (Building and testing on mac)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
🔇 Additional comments (6)
test/test.py (6)

27-27: LGTM: Import updated for expanded protocol version coverage.

The import correctly reflects the expanded protocol version support (v1, v2, v3, and v4), aligning with changes in mock_server.py.


48-48: LGTM: Protocol version v3 test class import.

The import of TestPandasProtocolVersionV3 is consistent with the existing pattern for v1 and v2, and is properly conditional on pandas availability.


420-421: LGTM: Bad protocol version test correctly updated.

The test now correctly validates that version 4 is invalid (since only versions 1-3 are supported), and the error message accurately reflects the valid range.

Also applies to: 430-431


1152-1153: LGTM: HTTP auto protocol version test extended to v3.

The test correctly validates that when the server supports v1, v2, and v3, the client auto-negotiates to the highest version (v3).


1176-1177: LGTM: Unsupported protocol version test updated.

The test now correctly validates the scenario where the server only supports v4 (which the client doesn't support), ensuring proper forward-compatibility error handling.


1479-1479: LGTM: Bug fix for incorrect test class name.

The name attribute now correctly reflects that this test class is for protocol version 2, not version 1.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b00b39 and cad427c.

📒 Files selected for processing (1)
  • test/system_test.py (1 hunks)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/system_test.py (1)

149-151: Use string variant for “_str” test.

test_auth_tls_insecure_skip_verify_str passes a bool; use the string form to match the test’s intent.

-        self._test_scenario(self.qdb_auth, 'tcps', tls_verify=False, **AUTH)
+        self._test_scenario(self.qdb_auth, 'tcps', tls_verify='unsafe_off', **AUTH)
src/questdb/ingress.pyx (1)

1048-1057: Fix error message list; add missing comma and include Decimal/TimestampNanos.

Current message concatenates 'datetime.datetime' 'numpy.ndarray' and omits new types.

-            valid = ', '.join((
-                'bool',
-                'int',
-                'float',
-                'str',
-                'TimestampMicros',
-                'datetime.datetime'
-                'numpy.ndarray'))
+            valid = ', '.join((
+                'bool',
+                'int',
+                'float',
+                'str',
+                'TimestampMicros',
+                'TimestampNanos',
+                'datetime.datetime',
+                'numpy.ndarray',
+                'Decimal'))
♻️ Duplicate comments (2)
test/system_test.py (1)

222-222: Fix timestamp type threshold: use < (9, 1, 0), not <=.

9.1.0 should use TIMESTAMP_NS; current check treats it as pre‑9.1.0.

-exp_ts_type = 'TIMESTAMP' if self.qdb_plain.version <= (9, 1, 0) else 'TIMESTAMP_NS'
+exp_ts_type = 'TIMESTAMP' if self.qdb_plain.version < (9, 1, 0) else 'TIMESTAMP_NS'
src/questdb/dataframe.pxi (1)

152-157: Remove unnecessary GIL requirement for Arrow decimal sources.

Arrow decimal serializers run nogil except error paths; marking them GIL-required degrades perf for whole DF serialization.

-    col_source_decimal_pyobj =          701100
-    col_source_decimal32_arrow =        702100
-    col_source_decimal64_arrow =        703100
-    col_source_decimal128_arrow =       704100
-    col_source_decimal256_arrow =       705100
+    col_source_decimal_pyobj =          701100
+    col_source_decimal32_arrow =        702000
+    col_source_decimal64_arrow =        703000
+    col_source_decimal128_arrow =       704000
+    col_source_decimal256_arrow =       705000
🧹 Nitpick comments (1)
ci/run_tests_pipeline.yaml (1)

85-91: Temporary decimal-branch test block: ensure removal before merge.

The comment says “Remove before merging decimal support PR”. Please gate or remove this block prior to merge into main.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cad427c and 3a57ea4.

📒 Files selected for processing (5)
  • ci/run_tests_pipeline.yaml (4 hunks)
  • src/questdb/dataframe.pxi (17 hunks)
  • src/questdb/ingress.pyx (8 hunks)
  • src/questdb/mpdecimal_compat.pxd (1 hunks)
  • test/system_test.py (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.

Applied to files:

  • src/questdb/ingress.pyx
  • src/questdb/dataframe.pxi
  • src/questdb/mpdecimal_compat.pxd
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
  • Sender (821-1153)
  • row (202-217)
  • row (380-485)
  • row (1030-1049)
  • dataframe (219-230)
  • dataframe (487-769)
  • dataframe (1051-1099)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_arm64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel start_linux_arm64_agent_aws)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_x86_64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel windows_i686)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel macos_x64)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_musllinux)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_pypy)
  • GitHub Check: questdb.py-questdb-client (1) (cibuildwheel linux_x64_cpython_manylinux_x86_64)
  • GitHub Check: questdb.py-questdb-client (Building and testing on windows-msvc-2019)
  • GitHub Check: questdb.py-questdb-client (Building and testing on mac)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-qdb-master)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux-old-pandas)
  • GitHub Check: questdb.py-questdb-client (Building and testing on linux)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion2x)
  • GitHub Check: questdb.py-questdb-client (Building and testing TestsAgainstVariousNumpyVersion1x)
🔇 Additional comments (9)
test/system_test.py (4)

32-32: Version pin looks fine; remember to bump for release.

Update QUESTDB_VERSION to the first release with decimals before merging this PR.


36-36: Confirm decimal feature floor.

Ensure FIRST_DECIMAL_RELEASE = (9, 2, 0) matches the server release that first exposes DECIMAL. Adjust if the final release differs.


273-293: Decimal (python object) test looks good. Verify type name stability.

Expected column type DECIMAL(18,3) matches current behavior; double‑check it won’t change across target server versions.


294-327: Arrow decimal: expected scale vs dtype scale.

DataFrame uses Arrow dtype with scale 2, but asserts DECIMAL(18,3) and values with 3 fractional digits. Confirm server’s canonicalization to scale=3 is intended and stable; otherwise, consider making the expectation reflect Arrow scale.

ci/run_tests_pipeline.yaml (1)

82-83: Env wiring for master tests looks correct.

src/questdb/ingress.pyx (2)

145-147: BadDataFrame computed offset is intentional—OK.

Design keeps Python-only code after last C enum. No C mapping needed.

Based on learnings


830-834: Protocol v3: validation message and range check look good.

src/questdb/dataframe.pxi (2)

956-979: Decimal serialization paths look correct. Two sanity notes.

  • Arrow endianness handling and sign-propagation via two’s complement look fine.
  • Python Decimal path delegates to decimal_pyobj_to_binary and respects the 127‑byte mantissa cap; with the mpdecimal fix (see separate comment), this avoids false rejections.

If available, add a negative Arrow decimal case in tests to exercise sign handling for 32/64/128/256.

Also applies to: 2170-2202, 2217-2300


1301-1316: Bitmask helpers: correctness looks good.

Validity and bit extraction logic are correct and match Arrow layout.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/system_test.py (1)

81-83: Document the version override mechanism.

The environment variable TEST_QUESTDB_INTEGRATION_FORCE_MAX_VERSION allows testing pre-release features (like decimal support) against development builds of QuestDB. This is useful for tandem PR development but should be documented.

Consider adding a comment explaining:

  • The purpose of this override
  • When it should be used (e.g., testing against QuestDB builds from source)
  • That it's intended for development/CI, not production use
+        # Allow testing pre-release features against development QuestDB builds
+        # Set TEST_QUESTDB_INTEGRATION_FORCE_MAX_VERSION=1 to bypass version checks
         if os.environ.get('TEST_QUESTDB_INTEGRATION_FORCE_MAX_VERSION') == '1':
             cls.qdb_plain.version = (999, 999, 999)
             cls.qdb_auth.version = (999, 999, 999)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac4350a and 8c3b485.

📒 Files selected for processing (2)
  • c-questdb-client (1 hunks)
  • test/system_test.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • c-questdb-client
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.

Applied to files:

  • test/system_test.py
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
  • Sender (821-1153)
  • row (202-217)
  • row (380-485)
  • row (1030-1049)
  • dataframe (219-230)
  • dataframe (487-769)
  • dataframe (1051-1099)
🔇 Additional comments (6)
test/system_test.py (6)

11-11: LGTM!

The decimal import is correctly added at the module level and is necessary for the new decimal test cases.


26-26: LGTM!

Setting pyarrow = None in the except block properly addresses the past review feedback about guarding the pyarrow import. This allows the @skipIf(not pyarrow, ...) decorators to function correctly when pyarrow is not installed.


222-222: LGTM!

The version comparison fix correctly uses < instead of <=, ensuring that QuestDB 9.1.0 and later versions properly return TIMESTAMP_NS while earlier versions return TIMESTAMP. This addresses the past review feedback.


273-295: LGTM!

The test_decimal_py_obj test is well-structured:

  • Properly gated by the FIRST_DECIMAL_RELEASE version check
  • Creates a table with explicit DECIMAL(18,3) schema
  • Correctly validates both column metadata and inserted values
  • Aligns with QuestDB's decimal handling

297-333: LGTM!

The test_decimal_pyarrow test correctly validates decimal handling through pandas DataFrames with pyarrow backing:

  • Properly decorated with @skipIf for optional dependencies
  • Version-gated by FIRST_DECIMAL_RELEASE
  • The test expects DECIMAL(18,3) and values with 3 decimal places even though the DataFrame uses pyarrow.decimal128(18, 2) (scale 2), which correctly reflects QuestDB's default behavior of storing decimals with scale 3

Based on learnings


32-36: Version availability confirmed; code is correct.

QuestDB version 9.1.1 exists and 9.2.0 was released on November 13, 2025. The version strategy in the code is sound:

  • QUESTDB_VERSION = '9.1.1' references an available release
  • FIRST_DECIMAL_RELEASE = (9, 2, 0) aligns with the decimal feature now being released
  • Decimal tests will correctly activate when QuestDB 9.2.0+ is installed

The code requires no changes. The "DO NOT MERGE" label appears intentional, likely held pending the decimal release, which has now occurred.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d70bb2 and 1cfca36.

📒 Files selected for processing (1)
  • src/questdb/mpdecimal_compat.h (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
🪛 Clang (14.0.6)
src/questdb/mpdecimal_compat.h

[error] 3-3: 'Python.h' file not found

(clang-diagnostic-error)

🔇 Additional comments (5)
src/questdb/mpdecimal_compat.h (5)

7-18: Platform detection logic is correct.

The conditional typedef based on SIZE_MAX correctly distinguishes 32-bit and 64-bit platforms, and the corresponding MPD_RADIX_CONST values (10^19 and 10^9) match libmpdec's implementation.


36-38: Inline accessor is correct.

The cast and field access are straightforward. The lack of null checking is acceptable for this low-level accessor, as callers are expected to validate inputs.


40-43: Correctly handles inline vs. heap-allocated decimal data.

The conditional return properly distinguishes between small decimals using the inline data[4] array and larger decimals requiring heap allocation via dec->dec.data.


45-51: Flag definitions are standard and correct.

The enum values match libmpdec's flag definitions, and MPD_FLAG_SPECIAL_MASK provides a convenient way to check for any special value (infinity or NaN).


53-53: LGTM!

Correct initialization of the radix constant from the platform-specific macro.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/system_test.py (1)

293-328: test_decimal_pyarrow correctly validates Arrow-backed DECIMAL behavior.

The test combines pyarrow and pandas to send a decimal column with Arrow scale 2, while asserting a DECIMAL(18,3) column and 3‑decimal string values (-99999.990, -678.000, None), which aligns with QuestDB’s default DECIMAL(18,3) storage semantics. You might optionally set min_rows=3 in retry_check_table to mirror the number of inserted rows, but functionally this is fine as‑is.

Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1cfca36 and f0d6987.

📒 Files selected for processing (1)
  • test/system_test.py (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.

Applied to files:

  • test/system_test.py
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
  • Sender (821-1153)
  • row (202-217)
  • row (380-485)
  • row (1030-1049)
  • dataframe (219-230)
  • dataframe (487-769)
  • dataframe (1051-1099)
🔇 Additional comments (5)
test/system_test.py (5)

11-26: Imports and optional dependency handling look correct.

Using a plain decimal import plus guarding pandas/pyarrow in a try/except and setting pyarrow = None integrates cleanly with the @skipIf(not pyarrow)/@skipIf(not pd) decorators and avoids premature ImportError when these extras are absent.


32-37: Version bump and decimal feature gate are consistent.

Updating QUESTDB_VERSION to '9.2.0' and introducing FIRST_DECIMAL_RELEASE = (9, 2, 0) provides an explicit guard for decimal tests and matches the PR objective to run them only once decimal support is available.


218-226: Timestamp type version check now matches TIMESTAMP_NS introduction.

The exp_ts_type condition using < (9, 1, 0) correctly treats 9.1.0 and newer servers as TIMESTAMP_NS, resolving the previous off‑by‑one behavior.


269-291: test_decimal_py_obj gives solid coverage for direct Decimal ingestion.

The test exercises HTTP ILP ingestion of a DECIMAL(18,3) column from a Python Decimal, verifies the column metadata, and asserts the stored value, all gated behind FIRST_DECIMAL_RELEASE so it won’t run on older servers.


331-332: __main__ unittest entrypoint is fine.

Keeping unittest.main() under the usual guard is standard and harmless for this test module.

Copy link
Collaborator

@amunra amunra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outstanding changes required:

  • Clean-up of mpd binary buffer extraction.
  • A few more decimal.Decimal tests.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/test_dataframe.py (1)

609-653: Arrow decimal tests look correct; still missing explicit boundary coverage

The Arrow decimal tests validate all four widths, null handling, and payload layout (scale + byte_width + big‑endian two’s‑complement mantissa) against _decimal_binary_payload, which matches the new Cython serializers. What’s still missing, per the earlier review, is explicit coverage of values at or very near the representable limits for each decimal width (both positive and negative), to fully exercise the mpdecimal extraction and overflow behavior. Consider extending arrow_cases with boundary‑case mantissas for each arrow_type.

🧹 Nitpick comments (1)
test/test_dataframe.py (1)

84-121: Decimal payload helpers match ILP layout; messages are fine for tests

The constant and helpers _decode_decimal_payload, _decimal_from_unscaled, and _decimal_binary_payload correctly mirror the binary decimal layout (marker, format type, scale, width, mantissa) and are well‑suited for asserting encoder behavior. Only minor note: Ruff’s TRY003 warnings about long messages can be ignored here since these are test‑only assertions, but if you want zero warnings you could factor the repeated messages into small helper functions.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0d6987 and 9db429f.

📒 Files selected for processing (2)
  • src/questdb/dataframe.pxi (17 hunks)
  • test/test_dataframe.py (5 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.

Applied to files:

  • test/test_dataframe.py
  • src/questdb/dataframe.pxi
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.

Applied to files:

  • src/questdb/dataframe.pxi
🧬 Code graph analysis (1)
test/test_dataframe.py (1)
src/questdb/ingress.pyi (1)
  • IngressError (65-70)
🪛 Ruff (0.14.5)
test/test_dataframe.py

90-90: Avoid specifying long messages outside the exception class

(TRY003)


93-93: Avoid specifying long messages outside the exception class

(TRY003)


95-95: Avoid specifying long messages outside the exception class

(TRY003)


97-97: Avoid specifying long messages outside the exception class

(TRY003)


102-103: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (7)
test/test_dataframe.py (3)

570-608: Good coverage of pyobj Decimal values, normalization, and specials

The three tests (test_decimal_pyobj_column, test_decimal_pyobj_trailing_zeros_and_integer, test_decimal_pyobj_special_values) give solid coverage: protocol‑version gating, numeric equality from decoded (scale, mantissa), canonicalization of trailing zeros, and special values encoded as empty mantissas with scale 0. This aligns well with the new decimal_pyobj_to_binary path in the dataframe serializer.


1714-1714: Updated ArrowDtype error message aligns with new resolver

The regex change to match Unsupported arrow type int16 for column 'a'... is consistent with _dataframe_series_resolve_arrow, which now reports unsupported ArrowDtype types in that form. This keeps tests and error reporting in sync.


1813-1816: Protocol v3 test class wiring is correct

TestPandasProtocolVersionV3 mirrors the v1/v2 classes, and the new decimal tests are correctly gated on self.version >= 3, so v3 coverage is enabled without affecting older protocol versions.

src/questdb/dataframe.pxi (4)

3-7: Decimal target/source wiring and scale field are consistent

The new decimal target (col_target_column_decimal = 9), its name mapping, decimal sources (pyobj + 4 Arrow widths), _TARGET_TO_SOURCES entry, _FIELD_TARGETS inclusion, and corresponding col_dispatch_code_t entries are all internally consistent. col_source_decimal_pyobj is correctly marked as GIL‑requiring (hundreds digit = 1) while Arrow decimal sources are not, matching the serialization pattern and allowing the main loop to drop the GIL when only Arrow decimals are present. The added scale field on col_t is zero‑initialized via calloc and only populated for Arrow decimal sources, which is exactly how the serializers consume it.

Also applies to: 86-111, 115-157, 174-181, 183-260, 263-273, 287-407, 427-433


1300-1315: Minor bitmap helpers change is purely stylistic

The tweaks in _dataframe_arrow_get_bool and _dataframe_arrow_is_valid only reformat the bitmask expression and add parentheses; semantics stay identical and are in line with the rest of the Arrow bitmap usage.


2170-2215: Pyobj Decimal serializer is correct and enforces sensible limits

serialize_decimal_py_obj cleanly delegates to decimal_pyobj_to_binary, handles special values via a mantissa is None convention, enforces the 127‑byte mantissa limit before calling line_sender_buffer_column_dec, and properly converts the returned bytes to a const uint8_t*. _dataframe_serialize_cell_column_decimal__decimal_pyobj respects the existing null semantics (_dataframe_is_null_pyobj) and simply skips nulls, mirroring the integer/float paths. This is a good fit with the new tests for regular, trailing‑zero, and special Decimals.


2217-2280: Arrow decimal serializers and dispatch cases look correct

The four Arrow decimal serializers:

  • Use _dataframe_arrow_is_valid for null‑bitmask handling, matching other Arrow numeric serializers.
  • Read raw values from buffers[1] with the correct width:
    • 32‑bit: single uint32_t,
    • 64‑bit: single uint64_t,
    • 128‑bit: two uint64_ts per value (offset << 1),
    • 256‑bit: four uint64_ts per value (offset << 2).
  • Apply bswap32/bswap64 and reverse word order for 128/256‑bit values so the mantissa is emitted in big‑endian two’s‑complement form.
  • Pass col.scale and the correct byte lengths (4, 8, 16, 32) to line_sender_buffer_column_dec.
  • Only touch C buffers and line_sender_* functions in the fast path, taking the GIL via _ensure_has_gil only on error, which matches how other Arrow serializers are written and justifies leaving the decimal Arrow sources marked as “no GIL required”.

The additional dispatch cases in _dataframe_serialize_cell correctly route each decimal target/source combination to its respective serializer, so decimal columns fully participate in the existing switch‑style dispatch without falling into the default error branch.

Also applies to: 2439-2448

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/system_test.py (1)

269-291: Well-structured test for Python Decimal ingestion.

The test properly validates decimal support for Python decimal.Decimal objects sent via the row() method. Version gating, explicit table creation with DECIMAL(18,3), and verification of both metadata and data are all correctly implemented.

Optionally, consider adding a test case with NULL decimal values to ensure completeness:

sender.row(
    table_name,
    columns={'dec_col': None},  # NULL value
    at=qi.ServerTimestamp)

However, the current basic test is sufficient for initial coverage.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9db429f and 6143de9.

📒 Files selected for processing (1)
  • test/system_test.py (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.

Applied to files:

  • test/system_test.py
🧬 Code graph analysis (1)
test/system_test.py (1)
src/questdb/ingress.pyi (7)
  • Sender (821-1153)
  • row (202-217)
  • row (380-485)
  • row (1030-1049)
  • dataframe (219-230)
  • dataframe (487-769)
  • dataframe (1051-1099)
🔇 Additional comments (4)
test/system_test.py (4)

26-26: Pyarrow import guard correctly addressed.

The pyarrow = None fallback in the exception handler properly addresses the previous review comment about guarding the import. Tests decorated with @unittest.skipIf(not pyarrow, ...) will now skip gracefully when pyarrow is unavailable.


218-218: Version comparison fix correctly applied.

The change from <= to < properly addresses the previous critical review comment. Version 9.1.0 introduced TIMESTAMP_NS, and this comparison now correctly maps versions before 9.1.0 to 'TIMESTAMP' and 9.1.0+ to 'TIMESTAMP_NS'.


293-326: DataFrame decimal ingestion test looks correct.

The test properly validates pandas DataFrame ingestion with pyarrow.decimal128(18, 2) dtype. The expectations are correct:

  • Input values with scale 2 are converted to the table's DECIMAL(18,3) definition
  • Output values are correctly padded to 3 decimal places ('-99999.990', '-678.000')

This aligns with the documented QuestDB behavior where the table's scale takes precedence over the input scale. Based on learnings.


32-36: Version constants correctly updated for decimal support in QuestDB 9.2.0.

QuestDB 9.2.0 introduced native DECIMAL type support, confirming that the QUESTDB_VERSION bump to '9.2.0' and the new FIRST_DECIMAL_RELEASE constant are correctly set. The constant is appropriately used in decimal tests for version gating, aligning with the PR checklist objective.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/questdb/dataframe.pxi (1)

1163-1164: ArrowDtype branch still assumes pandas has ArrowDtype; consider guarding or constraining pandas

This elif isinstance(dtype, _PANDAS.ArrowDtype): branch will still raise AttributeError on pandas versions that don’t define ArrowDtype, which then gets wrapped as IngressError(InvalidApiCall, ...) for any dtype. Prior review already suggested either caching ArrowDtype via getattr(pandas, "ArrowDtype", None) in _dataframe_may_import_deps() and guarding the isinstance checks, or tightening the minimum pandas version to one that guarantees ArrowDtype exists and documenting that. The same concern still applies here.

Check which minimum pandas version in your supported range first introduced `pandas.ArrowDtype`, and confirm whether `py-questdb-client` currently enforces at least that version in `pyproject.toml` / packaging metadata. If not enforced, consider adding a guard around `_PANDAS.ArrowDtype` as previously suggested.
🧹 Nitpick comments (4)
setup.py (1)

81-81: Good addition of build dependency tracking.

Explicitly passing the depends parameter ensures the extension is rebuilt when the static library files change. This is a useful improvement to the build system.

The list(extra_objects) copy at line 81 is defensive but unnecessary since extra_objects is already a list and isn't modified afterward. You could simplify to:

-    depends = list(extra_objects)
-
     return Extension(
         "questdb.ingress",
         ["src/questdb/ingress.pyx"],
         include_dirs=[
             "c-questdb-client/include",
             "rpyutils/include",
             np.get_include()],
         library_dirs=lib_paths,
         libraries=libraries,
         extra_compile_args=extra_compile_args,
         extra_link_args=extra_link_args,
         extra_objects=extra_objects,
-        depends=depends,
+        depends=extra_objects,
         define_macros = [
             ('NPY_NO_DEPRECATED_API', 'NPY_1_7_API_VERSION')
         ]
     )

Also applies to: 95-95

rpyutils/include/rpyutils.h (1)

115-124: Document out buffer size and out_size semantics more explicitly

The new API is clear at a type level but its contract is underspecified:

  • Callers aren’t told how large out must be (the Rust impl assumes a fixed 32‑byte buffer).
  • It’s also not stated whether *out_size is purely an output (number of bytes written) or has any in/out meaning.

Consider extending the comment to state that:

  • out must point to a buffer of at least 32 bytes, and
  • on success *out_size is set to the number of valid bytes at the end of that buffer.

This will make the FFI safer for external consumers without changing behavior.

test/test_dataframe.py (2)

11-12: Decimal helpers and pyobj decimal tests look correct and give strong coverage

  • The helpers (DECIMAL_BINARY_FORMAT_TYPE, _decode_decimal_payload, _unwrap_decimal, _decimal_from_unscaled, _decimal_binary_payload) match the documented binary layout and reconstruct values as expected, including sign and arbitrary precision.
  • test_decimal_pyobj_column exercises a wide range of values (including extremes across 64/128/256‑bit ranges and both signs) and directly compares the on‑wire (scale, unscaled) with the Decimal decomposition, which should shake out limb/sign/overflow issues.
  • Trailing‑zero and special‑value tests nicely pin down normalization behavior and the rejection of NaN/±Infinity.

Two small, non‑blocking nits you may consider:

  • In test_decimal_pyobj_column, narrowing the except Exception to qi.IngressError (or re‑raising unexpected exceptions) would address the BLE001 warning and make failures slightly more precise.
  • For _unwrap_decimal, you could simplify the scale logic a bit (e.g. compute unscaled once and derive scale directly from exponent) but the current form is perfectly readable and correct.

Overall, this is high‑value coverage for the new decimal path.

Also applies to: 84-134, 583-637


639-683: Arrow decimal tests correctly validate binary payloads; minor robustness tweak possible

The Arrow decimal cases validate:

  • All four width classes (32/64/128/256) using arrow_type.scale and arrow_type.byte_width.
  • Exact equality of the on‑wire payload to _decimal_binary_payload(unscaled, scale, byte_width).
  • That rows with null/invalid decimals omit the decimal field entirely (only the count field remains).

One minor edge case, only relevant if you later add more than one None in unscaled_values: in the unscaled is None branch you continue without advancing offset, so a subsequent iteration would still search from the previous row’s position. Consider updating offset there as well (e.g. by searching for the bare b' count={count}i\n' suffix) if you extend these tests in future.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6143de9 and 989c4e0.

⛔ Files ignored due to path filters (1)
  • rpyutils/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • rpyutils/Cargo.toml (1 hunks)
  • rpyutils/include/rpyutils.h (1 hunks)
  • rpyutils/src/lib.rs (1 hunks)
  • rpyutils/src/mpd_to_bigendian.rs (1 hunks)
  • setup.py (2 hunks)
  • src/questdb/dataframe.pxi (17 hunks)
  • src/questdb/line_sender.pxd (3 hunks)
  • src/questdb/mpdecimal_compat.pxd (1 hunks)
  • src/questdb/rpyutils.pxd (1 hunks)
  • test/test_dataframe.py (5 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-10-22T07:31:37.016Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: src/questdb/ingress.pyx:192-195
Timestamp: 2025-10-22T07:31:37.016Z
Learning: In src/questdb/ingress.pyx, the IngressErrorCode.BadDataFrame enum member is Python-internal only and does not map to any C error code from the underlying library. It is defined as `<int>line_sender_error_invalid_decimal + 1` and should not be added to the c_err_code_to_py function, which only handles C-to-Python error code translation.

Applied to files:

  • src/questdb/mpdecimal_compat.pxd
  • src/questdb/dataframe.pxi
  • src/questdb/line_sender.pxd
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.

Applied to files:

  • src/questdb/dataframe.pxi
  • test/test_dataframe.py
🧬 Code graph analysis (4)
rpyutils/include/rpyutils.h (1)
rpyutils/src/mpd_to_bigendian.rs (1)
  • qdb_mpd_to_bigendian (26-48)
rpyutils/src/lib.rs (1)
rpyutils/src/mpd_to_bigendian.rs (1)
  • mpd_to_bigendian (91-109)
rpyutils/src/mpd_to_bigendian.rs (1)
proj.py (1)
  • test (89-95)
test/test_dataframe.py (1)
src/questdb/ingress.pyi (1)
  • IngressError (65-70)
🪛 Ruff (0.14.5)
test/test_dataframe.py

90-90: Avoid specifying long messages outside the exception class

(TRY003)


93-93: Avoid specifying long messages outside the exception class

(TRY003)


95-95: Avoid specifying long messages outside the exception class

(TRY003)


97-97: Avoid specifying long messages outside the exception class

(TRY003)


102-103: Avoid specifying long messages outside the exception class

(TRY003)


614-614: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (7)
rpyutils/src/lib.rs (1)

25-27: Module wiring for mpd_to_bigendian looks correct

New module is declared consistently with existing ones and will compile the new FFI implementation into the static library.

rpyutils/Cargo.toml (1)

11-15: Adding i256 as a runtime dependency is reasonable

Using i256 as a dedicated 256‑bit integer type matches the new decimal/limb conversion logic; the manifest change is minimal and scoped.

If you haven’t already, please confirm this exact i256 version matches the one used/validated in the main QuestDB repo and that no extra features are required from the crate.

src/questdb/rpyutils.pxd (1)

62-72: qdb_mpd_to_bigendian Cython declaration matches the C header

The signature (const pointers, uint32_t exp, bint for bool, size_t *out_size) is consistent with rpyutils.h, so the binding surface looks correct.

src/questdb/line_sender.pxd (1)

31-46: New decimal error/enum values and buffer APIs are consistent with existing patterns

Appending line_sender_error_invalid_decimal, introducing line_sender_protocol_version_3, and adding the two decimal column functions all follow the existing FFI style: same bint/err_out pattern, and enum values are extended rather than reordered, so existing codes stay stable.

Please double‑check that:

  • These declarations exactly match questdb/ingress/line_sender.h (names, parameter types, and scale range), and
  • The new line_sender_error_invalid_decimal is correctly mapped to the new Python‐side IngressErrorCode without touching the Python‑only BadDataFrame code path (per the ingress.pyx learning).

Also applies to: 53-57, 268-283

test/test_dataframe.py (1)

1744-1745: Updated error message expectation and protocol v3 test wiring are consistent

  • The new regex "Unsupported arrow type int16 for column 'a'.*github" matches the more descriptive error text and keeps the test aligned with the current user‑facing message.
  • TestPandasProtocolVersionV3 cleanly reuses the existing TestPandasBase.TestPandas mixin, so all DataFrame tests (including the new decimal ones) now run under protocol version 3 as intended.

No further changes needed here.

Also applies to: 1843-1846

rpyutils/src/mpd_to_bigendian.rs (1)

26-48: Core mpdecimal→i256 conversion and FFI wrapper look correct and well‑tested

The limbs reduction, sign handling, 10^exp scaling, trimming, and FFI marshalling all line up with two’s‑complement semantics, and the tests cover the important boundaries (0, ±2²⁵⁵, overflow, varying radix/exp, and FFI error paths). I don’t see correctness or safety issues here; this module is good to go.

Also applies to: 50-109, 111-418

src/questdb/dataframe.pxi (1)

59-74: Decimal dataframe plumbing (targets, sources, dispatch, and serializers) looks coherent

The new decimal target/source enums, _TARGET_TO_SOURCES / _FIELD_TARGETS mappings, col_dispatch_code_t variants, and col_t.scale field all line up, and the serializers use them consistently:

  • Pyobj Decimal columns are detected in _dataframe_series_sniff_pyobj and serialized via serialize_decimal_py_obj, which defers to decimal_pyobj_to_binary and then line_sender_buffer_column_dec.
  • Arrow Decimal32/64/128/256 columns are resolved through _dataframe_series_resolve_arrow, which validates arrowtype.scale in [0, 76], stores it in col.scale, and sets the appropriate col_source_decimal*_arrow.
  • The Arrow decimal serializers correctly respect the validity bitmap, byte‑swap from Arrow’s little‑endian storage to big‑endian, reorder 128/256‑bit limbs, and pass the right widths (4/8/16/32 bytes) plus col.scale into line_sender_buffer_column_dec.
  • Decimal Arrow sources are marked as non‑GIL (702000–705000) while the pyobj source remains GIL‑bound (701100), so the main loop can still run without the GIL when only Arrow decimals are present.

Given the tests in this PR around dataframe/Decimal ingestion, this wiring looks internally consistent and in line with the intended ILP Decimal encoding.

Also applies to: 86-112, 115-157, 173-181, 183-260, 262-273, 287-407, 427-433, 956-979, 1046-1047, 2170-2202, 2204-2265, 2426-2435

RaphDal and others added 2 commits November 21, 2025 12:52
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
test/test_dataframe.py (2)

583-616: Consider catching specific exception types.

The test uses a bare except Exception which can mask unexpected errors. While acceptable in test code for diagnostics, it's better to catch specific exception types like qi.IngressError and AssertionError.

Apply this diff to improve exception handling:

                 try:
                     buf = _dataframe(self.version, df, table_name='tbl', at=qi.ServerTimestamp)
                     (scale, mantissa) = _decode_decimal_payload(buf.splitlines()[0])
                     unscaled = int.from_bytes(mantissa, byteorder='big', signed=True)

                     (expected_scale, expected_unscaled) = _unwrap_decimal(decimal)

                     self.assertEqual(scale, expected_scale)
                     self.assertEqual(unscaled, expected_unscaled)
-                except Exception as ex:
+                except (qi.IngressError, AssertionError) as ex:
                     self.fail(f'Failed to serialize {decimal}: {ex}')

629-638: Consider using assertRaises for clearer test intent.

The test for special values uses a try/except pattern that could mask unexpected errors. Using assertRaises would make the test intent clearer and more robust.

Apply this diff:

         def test_decimal_pyobj_special_values(self):
             if self.version < 3:
                 self.skipTest('decimal datatype requires ILP version 3 or later')
             df = pd.DataFrame({'dec': [Decimal('NaN'), Decimal('Infinity'), Decimal('-Infinity')]})
-            try:
-                _dataframe(self.version, df, table_name='tbl', at=qi.ServerTimestamp)
-                self.fail("special values shouldn't be encoded")
-            except qi.IngressError:
-                pass
+            with self.assertRaises(qi.IngressError):
+                _dataframe(self.version, df, table_name='tbl', at=qi.ServerTimestamp)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e80cd37 and aeebd96.

📒 Files selected for processing (1)
  • test/test_dataframe.py (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.
📚 Learning: 2025-11-07T15:46:43.284Z
Learnt from: RaphDal
Repo: questdb/py-questdb-client PR: 114
File: test/system_test.py:317-324
Timestamp: 2025-11-07T15:46:43.284Z
Learning: In QuestDB, when decimal values are ingested, the server creates DECIMAL columns with a default precision of 18 and scale of 3, regardless of the scale declared in the client-side data type (e.g., pyarrow.decimal128(18, 2)). Values are stored and returned with 3 decimal places.

Applied to files:

  • test/test_dataframe.py
🧬 Code graph analysis (1)
test/test_dataframe.py (1)
src/questdb/ingress.pyi (1)
  • IngressError (65-70)
🪛 Ruff (0.14.5)
test/test_dataframe.py

90-90: Avoid specifying long messages outside the exception class

(TRY003)


93-93: Avoid specifying long messages outside the exception class

(TRY003)


95-95: Avoid specifying long messages outside the exception class

(TRY003)


97-97: Avoid specifying long messages outside the exception class

(TRY003)


102-103: Avoid specifying long messages outside the exception class

(TRY003)


614-614: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (5)
test/test_dataframe.py (5)

11-11: LGTM!

The Decimal import is necessary for the new decimal support functionality.


84-133: LGTM!

The helper functions correctly implement decimal encoding/decoding logic:

  • _decode_decimal_payload properly parses and validates the binary format
  • _unwrap_decimal correctly converts Decimal to unscaled/scale representation
  • _decimal_binary_payload generates the proper big-endian binary format
  • _decimal_from_unscaled reconstructs Decimals using the correct scaleb method

659-703: LGTM!

This test comprehensively covers Arrow decimal types:

  • All four decimal types (decimal32, decimal64, decimal128, decimal256) with appropriate scales
  • Null value handling
  • Binary payload validation
  • Protocol version compatibility

1764-1764: LGTM!

The updated error message pattern is more descriptive, specifying both the arrow type and column name, which improves clarity for users.


1863-1866: LGTM!

The new test class follows the established pattern for protocol version testing and will properly exercise all decimal-related tests with protocol version 3.

@amunra amunra changed the title feat: support decimal feat: decimal support Nov 21, 2025
@amunra amunra merged commit 50ded92 into main Nov 21, 2025
18 checks passed
@amunra amunra deleted the rd_decimal branch November 21, 2025 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants